listenbrainz: Add pagination, play count aggregation, and recording_mbid fix#6484
listenbrainz: Add pagination, play count aggregation, and recording_mbid fix#6484arsaboo wants to merge 6 commits intobeetbox:masterfrom
Conversation
|
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry. |
There was a problem hiding this comment.
Pull request overview
grug see PR try make listenbrainz import correct and less wasteful. It add pagination so not stuck at 25 listens, it aggregate listen events into real play counts, and it use recording_mbid from API mapping to avoid extra MusicBrainz lookup.
Changes:
- paginate ListenBrainz listens (loop pages via
max_ts) - aggregate individual listens into per-track
playcountbefore callingupdate_play_counts - add tests for pagination, aggregation, and
recording_mbidbehavior (plus changelog entry)
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
beetsplug/listenbrainz.py |
add pagination, aggregation helper, and recording_mbid fix for play count import |
test/plugins/test_listenbrainz.py |
add tests for new pagination/aggregation and mapping behavior |
docs/changelog.rst |
document ListenBrainz bugfix behavior change |
| if max_ts is not None: | ||
| params["max_ts"] = max_ts | ||
| if min_ts is not None: | ||
| params["min_ts"] = min_ts | ||
|
|
||
| response = self._make_request(url, params) | ||
| if response is None: |
There was a problem hiding this comment.
grug see doc say min_ts and max_ts no mix. but code will send both if caller pass both. grug want guard: raise ValueError (or ui.UserError) when both set, so no weird API result.
| all_listens = [] | ||
|
|
||
| while True: | ||
| params = {"count": per_page} | ||
| if max_ts is not None: | ||
| params["max_ts"] = max_ts | ||
| if min_ts is not None: |
There was a problem hiding this comment.
grug see count become per_page, but no clamp to API max 1000 and no check for <=0. grug want clamp to 1000 and reject non-positive so request not fail or loop strange.
| found_total += found | ||
| unknown_total += unknown | ||
| """Obtain play counts from ListenBrainz.""" | ||
| listens = self.get_listens() |
There was a problem hiding this comment.
grug see when request fail, get_listens break and return empty list. then _lbupdate log "No listens found" like user have none, but really API error. grug want error path distinct: log warning/error and bail, or return None/raise so caller can tell.
| listens = self.get_listens() | |
| listens = self.get_listens() | |
| if listens is None: | |
| # grug: error path different from "no listens" path | |
| log.error("Failed to fetch listens from ListenBrainz; aborting import.") | |
| return |
beetsplug/listenbrainz.py
Outdated
| # Use the oldest listen's timestamp for the next page | ||
| max_ts = listens[-1]["listened_at"] | ||
|
|
||
| return all_listens | ||
|
|
||
| def get_tracks_from_listens(self, listens) -> list[Track]: | ||
| """Returns a list of tracks from a list of listens.""" |
There was a problem hiding this comment.
grug worry pagination no progress guard. max_ts set to oldest listened_at, but if API treat max_ts inclusive or many listens share same second, next page can repeat and maybe loop forever / overcount. grug want make next max_ts strictly older (ex minus 1) or detect max_ts not decreasing and break.
|
|
||
| # If we got fewer than requested, we've reached the end | ||
| if len(listens) < per_page: | ||
| break | ||
|
|
There was a problem hiding this comment.
grug see all_listens grow big then later convert to tracks then aggregate. for big user, memory boom. grug want stream pages: aggregate Counter as pages come, no keep every listen dict in list.
beetsplug/listenbrainz.py
Outdated
| play_counts: Counter[str] = Counter() | ||
| track_info: dict[str, Track] = {} | ||
| for t in tracks: | ||
| mbid = t.get("mbid") or "" | ||
| artist = t["artist"] | ||
| name = t["name"] | ||
| album = t.get("album") or "" | ||
|
|
||
| key = mbid if mbid else f"{artist}||{name}||{album}" |
There was a problem hiding this comment.
grug see aggregation key build with string join and '||'. if artist/title/album contain '||' then collision and wrong counts. grug want use tuple key (mbid) or (artist,name,album) instead of string concat.
| play_counts: Counter[str] = Counter() | |
| track_info: dict[str, Track] = {} | |
| for t in tracks: | |
| mbid = t.get("mbid") or "" | |
| artist = t["artist"] | |
| name = t["name"] | |
| album = t.get("album") or "" | |
| key = mbid if mbid else f"{artist}||{name}||{album}" | |
| play_counts: Counter[str | tuple[str, str, str]] = Counter() | |
| track_info: dict[str | tuple[str, str, str], Track] = {} | |
| for t in tracks: | |
| mbid = t.get("mbid") or "" | |
| artist = t["artist"] | |
| name = t["name"] | |
| album = t.get("album") or "" | |
| if mbid: | |
| key: str | tuple[str, str, str] = mbid | |
| else: | |
| key = (artist, name, album) |
| }, | ||
| { | ||
| "mbid": "m2", | ||
| "artist": "B", |
There was a problem hiding this comment.
grug see new test lines very long (nested dict in one line). repo Ruff check E501 unless file ignored, and this file not ignored. grug want wrap these dict literals across lines so lint pass.
| "name": "T", | ||
| "album": "Bl", | ||
| "playcount": 1, | ||
| }, |
There was a problem hiding this comment.
grug see same long-line problem for page2 listen dict too. please wrap line to stay under configured max line length so Ruff not yell.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6484 +/- ##
==========================================
+ Coverage 70.01% 70.10% +0.08%
==========================================
Files 146 146
Lines 18505 18545 +40
Branches 3010 3020 +10
==========================================
+ Hits 12957 13001 +44
+ Misses 4920 4911 -9
- Partials 628 633 +5
🚀 New features to boost your workflow:
|
|
@arsaboo could you resolve the Copilot comments that you've addressed? |
Follow-up to #6471 — fixes three remaining issues with the
listenbrainzplugin:playcount: 1. Without aggregation, the finallistenbrainz_play_countis always 1 regardless of actual listens.max_tsuntil all listens are retrieved.recording_mbidfrommbid_mappingwhen present. Previouslymbidwas left asNonewhen the mapping existed, falling back to expensive MB API lookups unnecessarily.Fixes #6469 (remaining issues after #6471)
docs/changelog.rstto the bottom of one of the lists near the top of the document.)